NSF PAR Search | NSF Public Access Repository


The case for accurate lifetime accounting in carbon metrics

https://doi.org/10.1145/3764944.3764965

Huang, Yujin; Zhu, Timothy; Gandhi, Anshul (August 2025, ACM SIGMETRICS Performance Evaluation Review)

To represent the entire carbon footprint of computing devices, carbon metrics often include both an embodied cost (i.e., carbon cost to produce the device) and an operational cost (i.e., carbon cost to run the device). The embodied carbon cost is typically high, but it is amortized over the lifetime of the device. In this vision statement, we argue that for carbon metrics to be useful, we need (i) accurate metrics for lifetime, which are challenging for SSDs, and (ii) correct reasoning about carbon costs when using such metrics.
Free, publicly-accessible full text available August 26, 2026
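The sensitivity to lifetime that the abstract describes can be illustrated with a minimal sketch. It assumes simple linear amortization of the embodied cost over the device lifetime, which is a common convention and not necessarily the accounting the authors propose. The function name and all numbers are hypothetical.

```python
def amortized_carbon(embodied_kg, lifetime_years, usage_years, operational_kg_per_year):
    """Carbon (kg CO2e) attributed to a usage window.

    Assumes linear amortization: the embodied cost is spread evenly
    over the assumed lifetime, and operational cost accrues per year.
    """
    if lifetime_years <= 0:
        raise ValueError("lifetime_years must be positive")
    embodied_share = embodied_kg * usage_years / lifetime_years
    operational = operational_kg_per_year * usage_years
    return embodied_share + operational


if __name__ == "__main__":
    # Hypothetical device: 100 kg embodied, 10 kg/year operational, used for 1 year.
    for assumed_lifetime in (3, 5):
        total = amortized_carbon(100, assumed_lifetime, 1, 10)
        print(f"assumed lifetime {assumed_lifetime} y -> {total:.2f} kg")
```

Under these assumptions, the same year of use is charged more carbon when the assumed lifetime is shorter, which is why an inaccurate lifetime estimate distorts the metric.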
Kneeliverse: A universal knee-detection library for performance curves

https://doi.org/10.1016/j.softx.2025.102161

Antunes, Mário; Estro, Tyler; Bhandari, Pranav; Gandhi, Anshul; Kuenning, Geoff; Liu, Yifei; Waldspurger, Carl; Wildani, Avani; Zadok, Erez (May 2025, SoftwareX)

Identifying knee and elbow points in performance curves is a critical task in various domains, including machine learning and system design. These points represent optimal trade-offs between cost and performance, facilitating efficient decision-making and resource allocation. However, accurately determining the knees and elbows in curves poses a significant challenge. To address this challenge, we introduce Kneeliverse, an open-source library dedicated to knee/elbow point detection. Kneeliverse incorporates a suite of well-established knee-detection algorithms, including Menger, L-method, Kneedle, and DFDT. Additionally, Kneeliverse extends these algorithms to detect multiple knees and elbows in complex curves, employing a recursive approach. Kneeliverse further includes Z-Method, a recently developed algorithm specifically designed for multi-knee detection.
Free, publicly-accessible full text available May 1, 2026
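As a rough illustration of curvature-based knee detection, the sketch below picks the interior point with the largest Menger curvature, the curvature of the circle through three consecutive points. It is written from the textbook definition and does not use or reproduce the Kneeliverse API. The function names are hypothetical, and the sketch skips the axis normalization and smoothing a practical detector would likely need.

```python
import math


def menger_curvature(a, b, c):
    """Curvature of the circle through points a, b, c (0.0 if degenerate)."""
    (x1, y1), (x2, y2), (x3, y3) = a, b, c
    # Twice the triangle area, via the cross product of two edge vectors.
    twice_area = abs((x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1))
    side_product = math.dist(a, b) * math.dist(b, c) * math.dist(a, c)
    if side_product == 0:
        return 0.0
    # Menger curvature = 4 * area / (product of side lengths).
    return 2.0 * twice_area / side_product


def find_knee(points):
    """Index of the interior point with maximum Menger curvature."""
    if len(points) < 3:
        raise ValueError("need at least three points")
    return max(
        range(1, len(points) - 1),
        key=lambda i: menger_curvature(points[i - 1], points[i], points[i + 1]),
    )


if __name__ == "__main__":
    # Hypothetical curve: steep drop followed by a flat tail.
    curve = [(0, 10), (1, 5), (2, 1), (3, 0.9), (4, 0.8), (5, 0.7)]
    print("knee index:", find_knee(curve))
```

This finds a single knee. The multi-knee detection described in the abstract would need additional logic, such as applying a detector recursively to sub-curves.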
Investigating WebRTC BBR as an alternative to GCC for live video streaming

Drucker, Rebecca; Gandhi, Anshul; Balasubramanian, Aruna (December 2024, IEEE)

Full Text Available
EcoEdgeInfer: Dynamically Optimizing Latency and Sustainability for Inference on Edge Devices

https://doi.org/10.1109/SEC62691.2024.00023

Rachuri, Sri Pramodh; Shaik, Nazeer; Choksi, Mehul; Gandhi, Anshul (December 2024, IEEE)

Full Text Available
KACE: Kernel-Aware Colocation for Efficient GPU Spatial Sharing

https://doi.org/10.1145/3698038.3698555

Han, Bing-Shiun; Paul, Tathagata; Liu, Zhenhua; Gandhi, Anshul (November 2024, ACM)

Full Text Available
OVIDA: Orchestrator for Video Analytics on Disaggregated Architecture

https://doi.org/10.1109/SEC62691.2024.00019

Singh, Manavjeet; Rachuri, Sri Pramodh; Cao, Bryan Bo; Sharma, Abhinav; Bhumireddy, Venkata; Bronzino, Francesco; Das, Samir R; Gandhi, Anshul; Jain, Shubham (December 2024, IEEE)

Full Text Available
GAMMA: Graph Neural Network-Based Multi-Bottleneck Localization for Microservices Applications

https://doi.org/10.1145/3589334.3645665

Somashekar, Gagan; Dutt, Anurag; Adak, Mainak; Lorido-Botran, Tania; Gandhi, Anshul (May 2024, ACM)

Full Text Available
Empirical Evaluation of ML Models for Per-Job Power Prediction

https://doi.org/10.1145/3629527.3651418

Halder, Debajyoti; Acharya, Manas; Malsane, Aniket; Gandhi, Anshul; Zadok, Erez (May 2024, ACM)

Sustainability has become a critical focus area across the technology industry, most notably in cloud data centers. In such shared-use computing environments, there is a need to account for the power consumption of individual users. Prior work on power prediction of individual user jobs in shared environments has often focused on workloads that stress a single resource, such as CPU or DRAM. These works typically employ a specific machine learning (ML) model to train and test on the target workload for high accuracy. However, modern workloads in data centers can stress multiple resources simultaneously, and cannot be assumed to always be available for training. This paper empirically evaluates the performance of various ML models under different model settings and training data assumptions for the per-job power prediction problem using a range of workloads. Our evaluation results provide key insights into the efficacy of different ML models. For example, we find that linear ML models suffer from poor prediction accuracy (as much as 25% prediction error), especially for unseen workloads. Conversely, non-linear models, specifically XGBoost and Random Forest, provide reasonable accuracy (7–9% error). We also find that data normalization and the power-prediction model formulation affect the accuracy of individual ML models in different ways.
Full Text Available
BBR vs. BBRv2: A Performance Evaluation

https://doi.org/10.1109/COMSNETS59351.2024.10427175

Drucker, Rebecca; Baraskar, Gauri; Balasubramanian, Aruna; Gandhi, Anshul (January 2024, IEEE)

Full Text Available
Accelerating multi-tier storage cache simulations using knee detection

https://doi.org/10.1016/j.peva.2024.102410

Estro, Tyler; Antunes, Mário; Bhandari, Pranav; Gandhi, Anshul; Kuenning, Geoff; Liu, Yifei; Waldspurger, Carl; Wildani, Avani; Zadok, Erez (May 2024, Performance Evaluation)

Storage cache hierarchies include diverse topologies, assorted parameters and policies, and devices with varied performance characteristics. Simulation enables efficient exploration of their configuration space while avoiding expensive physical experiments. Miss Ratio Curves (MRCs) efficiently characterize the performance of a cache over a range of cache sizes, revealing "key points" for cache simulation, such as knees in the curve that immediately follow sharp cliffs. Unfortunately, there are no automated techniques for efficiently finding key points in MRCs, and the cross-application of existing knee-detection algorithms yields inaccurate results. We present a multi-stage framework that identifies key points in any MRC, for both stack-based (e.g., LRU) and more sophisticated eviction algorithms (e.g., ARC). Our approach quickly locates candidates using efficient hash-based sampling, curve simplification, knee detection, and novel post-processing filters. We introduce Z-Method, a new multi-knee detection algorithm that employs statistical outlier detection to choose promising points robustly and efficiently. We evaluated our framework against seven other knee-detection algorithms, identifying key points in multi-tier MRCs with both ARC and LRU policies for 106 diverse real-world workloads. Compared to naïve approaches, our framework reduced the total number of points needed to accurately identify the best two-tier cache hierarchies by an average factor of approximately 5.5x for ARC and 7.7x for LRU. We also show how our framework can be used to seed the initial population for evolutionary algorithms. We ran 32,616 experiments requiring over three million cache simulations, on 151 samples, from three datasets, using a diverse set of population initialization techniques, evolutionary algorithms, knee-detection algorithms, cache replacement algorithms, and stopping criteria. Our results showed an overall acceleration rate of 34% across all configurations.
Full Text Available
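To make the idea of a miss ratio curve concrete, the sketch below builds one by brute force, simulating an LRU cache once per candidate size. This is an illustrative baseline only and not the paper's framework, which relies on sampling, curve simplification, and knee detection to avoid exhaustive simulation. The function names and the toy trace are hypothetical.

```python
from collections import OrderedDict


def lru_miss_ratio(trace, cache_size):
    """Fraction of requests in `trace` that miss an LRU cache of `cache_size` entries."""
    if not trace:
        raise ValueError("trace must be non-empty")
    cache = OrderedDict()  # ordered from least to most recently used
    misses = 0
    for key in trace:
        if key in cache:
            cache.move_to_end(key)  # hit: mark as most recently used
        else:
            misses += 1
            cache[key] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used entry
    return misses / len(trace)


def miss_ratio_curve(trace, sizes):
    """List of (cache_size, miss_ratio) pairs, one full simulation per size."""
    return [(size, lru_miss_ratio(trace, size)) for size in sizes]


if __name__ == "__main__":
    # Hypothetical cyclic trace over three keys: LRU thrashes until all three fit.
    trace = [1, 2, 3] * 3
    for size, ratio in miss_ratio_curve(trace, [1, 2, 3, 4]):
        print(f"size {size}: miss ratio {ratio:.2f}")
```

On this toy trace the miss ratio stays at 1.0 for sizes 1 and 2 and drops sharply at size 3, a small example of the cliff-then-knee shape the abstract refers to.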
